Using electroencephalogram (EEG) signals to recognize emotions has several
advantages. The success of such studies, however, is strongly influenced
by: i) the distribution of the data used, ii) consideration of differences in
participant characteristics, and iii) consideration of the characteristics of the EEG
signals. In response to these issues, this study examines three important
points that affect the success of emotion recognition, packaged in several
research questions: i) What factors need to be considered to generate and
distribute EEG data?, ii) How can EEG signals be generated with
consideration of differences in participant characteristics?, and iii) How do
EEG signals with characteristics exist among their features for emotion
recognition? The results indicate some important challenges to be
studied further in EEG signals-based emotion recognition research. These
include: i) determining robust methods for imbalanced EEG signals data, ii)
determining the appropriate smoothing method to eliminate disturbances on
the baseline signals, iii) determining the best baseline reduction methods to
reduce the differences in the characteristics of the participants on the EEG
signals, and iv) determining a robust architecture of the capsule network method
to overcome the loss of knowledge information and applying it to more diverse
data sets.
1. INTRODUCTION


Emotions are interactions and behaviors of human psychology which play an important role in
everyday human social interactions. They usually arise as a response to certain conditions or problems
representing a certain target to be achieved [1]. Positive emotions can maintain a person's
mental state and increase work efficiency. In contrast, negative emotions cause mental state disorders, and
their buildup can eventually lead to depression. Moreover, emotions arise spontaneously amid
physical and physiological changes associated with human organs and tissues such as the brain, heart, skin,
blood flow, muscles, facial expressions, and voice [2]. It is, therefore, important to recognize human
emotions in an effort to understand human psychological interactions and behavior. There are generally two
major categories of emotion recognition methods, based on: i) physical or external aspects of humans and
ii) physiological signals or internal aspects of humans. However, emotions expressed externally are
often deliberately hidden within the social environment [2]-[4]. This problem is mostly addressed using
the physiological signals from the central nervous system (CNS) via electroencephalogram (EEG) signals [5].
EEG signals-based emotion recognition has several advantages: i) portability, low cost, and ease
of setup [5], ii) rich spatial, temporal, and spectral data on human affective experiences which support the
underlying neural mechanisms [6], [7], and iii) the occurrence of emotional reactions first in the human brain,
especially within the subcortical regions, which means changes in the human emotional condition can be
directly reflected in the EEG signals [4], [6]. EEG signals-supported emotion recognition studies have been
widely applied within two problem domains: medical and non-medical [8].

There has been rapid development of research on EEG signals-based emotion recognition over the past five
years in terms of data acquisition, data preprocessing, feature extraction, feature representation, and
the classification process [2], [7]-[11]. The success of such studies, however, is strongly influenced by:
i) the distribution of the data used [9], ii) consideration of differences in participant characteristics, such as
personality traits, intellectual abilities, and gender, in emotional reactions [12], [13], and iii) consideration of
the characteristics of the EEG signals, such as having a low frequency and containing spatial information
[14], [15]. In response to these issues, the research presented here examines three important points that
affect the success of emotion recognition, packaged in several research questions: i) what factors need to be
considered to generate and distribute EEG data to represent emotional reactions?, ii) how can EEG signals be
generated with consideration of differences in participant characteristics?, and iii) how do EEG signals with
characteristics exist among their features for emotion recognition? The findings of this study are
expected to be a reference for further research on emotion recognition based on EEG signals.


2. RESEARCH METHOD 
This literature study was based on several articles retrieved from www.scopus.com, and the articles
were collected through the two stages explained in the following subsections.


2.1. Selection stage 

Several criteria were applied when searching the electronic database, expressed as the query
shown in Figure 1. The retrieval based on this query resulted in 316 articles, consisting of 171
conference papers and 145 journal articles.


2.2. Analysis stage 

The next process is to analyze the articles obtained, and the process involved four stages: i) stage 1,
focusing on the EEG signals and emotion recognition by checking the title and abstract, ii) stage 2, checking
the access of the articles, iii) stage 3, focusing on the three issues of the study by checking each article's
introduction and methods, and iv) stage 4, selecting the relevant articles by checking the results and conclusion
of each article. The stages of selection and analysis of the articles are presented in Figure 2.


316 document results


TITLE-ABS-KEY ( eeg AND emotion AND recognition AND method ) AND ( LIMIT-TO ( SRCTYPE , "j" ) OR LIMIT-TO ( SRCTYPE , "p" ) ) AND ( LIMIT-TO ( PUBSTAGE , "final" ) )
AND ( LIMIT-TO ( DOCTYPE , "ar" ) OR LIMIT-TO ( DOCTYPE , "cp" ) ) AND ( LIMIT-TO ( SUBJAREA , "COMP" ) ) AND ( LIMIT-TO ( PUBYEAR , 2020 ) OR LIMIT-TO ( PUBYEAR ,
2019 ) OR LIMIT-TO ( PUBYEAR , 2018 ) OR LIMIT-TO ( PUBYEAR , 2017 ) OR LIMIT-TO ( PUBYEAR , 2016 ) ) AND ( LIMIT-TO ( LANGUAGE , "English" ) )


Figure 1. The query for searching articles 


[Figure: article counts recovered from the diagram: Stage 2, 216 articles; Stage 3, 124 articles; Stage 4, 52 articles]

Figure 2. Selection and analysis stages of articles

Based on the analysis stage, 52 relevant articles were obtained as references in this study. Apart
from the 52 articles obtained from searching the Scopus database, this study also uses 37 additional
articles to enrich the research study, for a total of 89 articles. The distribution of
articles over the three research questions (RQ) in this study is presented in Table 1. Because an article can
answer more than one RQ, it can be counted under several RQs, so the total in Table 1 (96) exceeds 89.


Table 1. Analysis of the articles 


No Research questions Selected articles 
1 RQ1 25 articles 
2 RQ2 27 articles 
3 RQ3 44 articles 
Total 96 articles 


3. RESULTS AND DISCUSSION 
This study reviewed several issues associated with EEG signals-based emotion recognition, which
can be used for further research.


3.1. RQ 1: What factors need to be considered to generate and distribute EEG data to represent 
emotional reactions? 

Several factors are taken into consideration in generating the EEG dataset for emotion recognition,
including [11]:

a) Stimulus media: The literature studies showed several categories of stimuli to evoke emotions such as 
audio [16], [17], visual, and audio-visual media [14] as well as others including the ambient assisted 
living (AAL) technology [18], a combination of music, video, and game stimuli [19], mobile learning 
application [20], augmented reality (AR) [21], virtual reality (VR) [22], [23], and tactile enhanced 
multimedia (TEM) [24]. 

b) Proper stimuli presentation setup [11]. Several factors influence the presentation of a stimulus,
including the monitor screen size, lighting, viewing angle, and viewing distance, and their relationship is
represented in (1):


$\hat{y} = -14 + 70x_1 + 2x_2 - 0.0015x_2^2 + 0.46x_3^2$ (1)


where $\hat{y}$ is the prediction of the preferred viewing distance (millimeters), $x_1$ represents the TV monitor
size (inches), $x_2$ represents the illumination value in the room (lux), while $x_3$ represents the viewing
angle (degrees).

c) Standardization of experimental protocols [11]. The stimulus presentation in experimental design is an 
important factor influencing the type of emotion it evokes. Therefore, the general implementation 
protocol to extract emotions is explained in Figure 3. 


[Figure: experimental timeline; labels recovered: Post-stimuli (in minutes/seconds), M]


Figure 3. Experimental design [11] 


where, R is a relaxation time or blank screen condition, C is a countdown frame, W is a white 
cross-presentation/baseline/normal state, S is a presented stimulus, and M is a self-assessment manikin 
(SAM)/rest time assessment. 
Several studies have provided publicly available emotional datasets that other researchers can use,
such as: DEAP [25], ASCERTAIN [26], GAMEEMO [27], DREAMER [28], MPED [29], SAFE [30],
AMIGOS [31], MAHNOB-HCI [32], and SEED-IV [33]. Although these datasets are publicly available,
they may have an unbalanced distribution of data, and the performance of an emotion recognition
method depends on the data balance. Several studies that used public datasets such as DEAP observed
a high imbalance of emotional data. Out of the 32 respondents analyzed, only respondents 16, 28, 30, and 31 had
balanced data for the arousal label, while only respondents 10, 14, 15, 22, and 32 had balanced data for the
valence label [34]. There are several methods for data imbalance problems, such as the
adaptive synthetic sampling approach for imbalanced learning (ADASYN) [34], a novel fitness function,
g-score genetic programming (GGP) [35], and the synthetic minority oversampling technique (SMOTE)
[36], [37]. However, most of the existing oversampling methods still produce overlapping data in the final
results, making it difficult to determine the decision boundary for each class. The radius-SMOTE method can
overcome this problem by emphasizing the initial selection approach and generating synthetic data
based on the safe radius distance. However, the radius-SMOTE method has limitations in detecting noise in
the data boundary area [38]. Therefore, the challenge for future studies is to determine robust methods for
imbalanced EEG signals data.
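
To make the oversampling idea concrete, the following is a minimal sketch of the basic SMOTE interpolation step, not the radius-SMOTE variant of [38] and not any library's implementation. The function name `smote_oversample` and its parameters are assumptions introduced for illustration; it expects at least two minority samples.

```python
import numpy as np

def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    sample and one of its k nearest minority-class neighbours (basic SMOTE)."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)                          # cannot have more neighbours than samples
    # Pairwise Euclidean distances inside the minority class.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)             # a sample is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)      # index of the starting sample
    pick = rng.integers(0, k, size=n_new)      # which neighbour to move towards
    gap = rng.random((n_new, 1))               # interpolation factor in [0, 1)
    neigh = minority[neighbours[base, pick]]
    return minority[base] + gap * (neigh - minority[base])
```

Because every synthetic point lies on a segment between two real minority samples, points generated near the class boundary can overlap with the majority class, which is the limitation discussed above.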


3.2. RQ 2: How can an EEG signal be generated with consideration of differences in participant 
characteristics? 

The participants' emotional reactions in EEG signals are strongly influenced by the different
characteristics of participants, such as personality traits, intellectual abilities, and gender [9], [12]. These
differences can produce unique EEG signal patterns. Several studies have
examined the use of baseline EEG signals to account for the different participant characteristics in the
experimental signals [39], [40]. It is important to note that the baseline EEG signals represent a calm state before a
stimulus medium is given [25], [28], [31], [41], [42]. In the baseline approach, all C channels of the
baseline signals are cut into N segments of length L, and the segments are averaged to
obtain the BaseMean value using (2). The baseline reduction on the EEG trial signal
is then carried out by subtracting the BaseMean value from the EEG trial signal using (3).


$\mathrm{BaseMean} = \frac{\sum_{i=1}^{N} \mathrm{BaseSignal}_i}{N}$ (2)


$\mathrm{Clean\_EEG}_i = \mathrm{Trial\_EEG}_i - \mathrm{BaseMean}$ (3)


The Clean_EEG signals are EEG signals that represent a subjective emotional reaction
according to a given media stimulus. Several baseline reduction methods are applicable to characterize
signals data, such as the difference, relative difference, and fractional difference methods. However, their
effectiveness has so far been demonstrated only on black tea aroma data [43]. Tea aroma data share
characteristics with EEG signals data, such as containing a lot of noise and weak frequency intensity.
Therefore, it is a challenge for future research to test whether the three baseline reduction methods are
suitable for use in EEG signals data.
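
The baseline reduction in (2) and (3) can be sketched as follows. This is an illustrative NumPy version under stated assumptions: signals are stored as arrays of shape (channels, samples), the segment length L is passed as `seg_len`, and samples that do not fill a whole segment are dropped. The function names are hypothetical.

```python
import numpy as np

def baseline_mean(base_signal, seg_len):
    """Eq. (2): cut a (C, T) baseline into N segments of length L and average them."""
    c, t = base_signal.shape
    n = t // seg_len                                   # number of whole segments N
    segments = base_signal[:, :n * seg_len].reshape(c, n, seg_len)
    return segments.mean(axis=1)                       # BaseMean, shape (C, L)

def remove_baseline(trial_signal, base_mean):
    """Eq. (3): subtract BaseMean from every length-L segment of the trial signal."""
    c, t = trial_signal.shape
    seg_len = base_mean.shape[1]
    n = t // seg_len
    segments = trial_signal[:, :n * seg_len].reshape(c, n, seg_len)
    return (segments - base_mean[:, None, :]).reshape(c, n * seg_len)
```

For example, with a 3-second baseline and 1-second segments, `seg_len` would equal the sampling rate and N would be 3.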

The baseline signals approach has increased emotion recognition accuracy compared to approaches that do not
use the EEG baseline signals [40], [44], [45]. This approach also significantly increases the
accuracy of recognizing 2 classes of emotions (arousal and valence) and 4 classes of emotions (high arousal
positive valence; high arousal negative valence; low arousal negative valence; and low arousal positive
valence) [46]. Other studies have also proposed a correlation approach to determine the baseline signals
that have a high correlation with the stimulus medium [47]. This approach can address cross-subject
emotion recognition. Although the baseline EEG signals approach has produced high accuracy, it
is strongly influenced by the quality of the baseline EEG signals [9]. Recording baseline EEG signals that
are free from external disturbances, internal disturbances, and disturbances originating from the participants'
emotional reactions is not easy, even when the participants are in a calm state [39], [48], [49]. Such
disturbances prevent the baseline EEG signals from characterizing the differences in participant characteristics
found in the EEG signals. Several methods are applicable to eliminate disturbances/artifacts in the EEG signals,
including regression [50], wavelet transform [51], and blind source separation (BSS), which further includes
techniques such as independent component analysis (ICA), usually applied for electrooculography
(EOG) artifacts [5], [29] and eye blinking [52]. ICA removes artifacts by exploiting the
statistical independence between EEG and artifacts [10]. Another method is principal component analysis
(PCA), used to analyze EEG intervals not contaminated by artifacts by extracting eigenvalues and eigenvectors
corresponding to the clean EEG signals [45]. Meanwhile, signals mixed with eye blinks are usually
decomposed into a series of intrinsic mode functions (IMFs) [2]. Most artifact removal algorithms offer
good performance, but they focus only on detecting and removing specific artifacts such as EOG,
ECG, and EMG [50].

Another approach to eliminating external and internal interference in the EEG signals is
smoothing [53]. Smoothing methods include the mean filter, median
filter, Savitzky-Golay filter, discrete Kalman filter, Gaussian filter, and kernel density estimation
[54]. Smoothing the EEG signals reduces their fluctuations and
suppresses outliers [55], [56]. Therefore, the next research challenge is determining the appropriate
smoothing method to eliminate external or internal disturbances and emotional reactions in the baseline EEG
signals, and studying the best baseline reduction methods to account for the differences in the characteristics of
the participants in the trial signals.
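
As a rough illustration of the two simplest filters in that list, the sketch below implements a mean filter and a median filter for a single-channel signal with NumPy. The window size, the edge-replication padding, and the function names are assumptions made for this example; the cited studies may use different settings.

```python
import numpy as np

def _windows(x, size):
    """Sliding windows of odd length `size` over x, padded by repeating the edge values."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    padded = np.pad(np.asarray(x, dtype=float), size // 2, mode="edge")
    return np.lib.stride_tricks.sliding_window_view(padded, size)

def mean_filter(x, size=5):
    """Replace each sample by the average of its window (smooths fluctuations)."""
    return _windows(x, size).mean(axis=1)

def median_filter(x, size=5):
    """Replace each sample by the median of its window (suppresses isolated outliers)."""
    return np.median(_windows(x, size), axis=1)
```

On a flat signal with one isolated spike, the median filter removes the spike entirely, while the mean filter spreads it over the neighbouring samples; this difference is one reason the choice of smoothing method matters for baseline signals.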


3.3. RQ 3: How do EEG signals with characteristics exist among their features for emotion recognition?

EEG signals have several important characteristics that need to be considered in emotion
recognition, such as low frequency and spatial information. Several studies have identified some of these
characteristics in feature extraction, feature representation, and the classification process [2], [7]-[11].


3.3.1. Feature extraction 

Feature extraction is used to obtain features relevant to the emotional state from the EEG signals, and the
features are grouped into three categories [8], [29]:

a) Time domain features. These are based on the time domain of a signal, and those reviewed in
previous studies include mobility, complexity, and activity using Hjorth parameters [57], fractal
dimension using the Higuchi method [58], [59], event-related potential (ERP) features [60], and
statistical features [61].

b) Frequency domain features. These are based on the frequency domain of a signal, and those
reviewed in previous studies include power spectral density (PSD) [62], band power using wavelet
transform [59], [63], mel-frequency cepstral coefficients (MFCCs) [64], and differential
entropy (DE) [14], [15], [40], [65], [66].

c) Time-frequency domain features. These are based on the time-frequency domain of a signal, and
examples reviewed in previous studies include short-time Fourier transform (STFT) [67], discrete
wavelet transform (DWT) features [58], and a combination of statistical and fast Fourier transform
(FFT) methods [68].
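
As an example of the time domain features listed above, the Hjorth parameters can be computed from the variances of a signal and its successive differences. The sketch below uses the standard definitions (activity, mobility, complexity) with first differences as a discrete stand-in for the derivative; the function name is an assumption, and the exact variant used in [57] is not specified here.

```python
import numpy as np

def hjorth_parameters(x):
    """Return (activity, mobility, complexity) of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    dx = np.diff(x)                        # first difference approximates the derivative
    ddx = np.diff(dx)                      # second difference
    var_x, var_dx, var_ddx = np.var(x), np.var(dx), np.var(ddx)
    activity = var_x                       # signal power
    mobility = np.sqrt(var_dx / var_x)     # proportional to mean frequency
    complexity = np.sqrt(var_ddx / var_dx) / mobility  # close to 1 for a pure sine
    return activity, mobility, complexity
```

A pure sinusoid gives a complexity close to 1, and the value grows as the waveform departs from a sine shape.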

The differential entropy (DE) method has been found to
distinguish high-energy and low-energy patterns in EEG frequencies [14] and also to characterize spatial
information from EEG signals [15]. The features generated from the DE method are reported to be more accurate
and stable in emotion recognition than several others, such as autoregressive parameters, fractal
dimension, power spectral density (PSD), differential asymmetry (DASM), rational asymmetry (RASM),
asymmetry (ASM), differential caudality (DCAU), wavelet features, and sample entropy [40], [65], [66]. The
DE formula usually used to characterize an EEG signal is defined as (4) [40].


$h(X) = -\int_{X} f(x) \log(f(x))\,dx$ (4)


where X is a random variable and f(x) is the probability density function of X. Meanwhile, the DE of the
series X obeying the Gauss distribution $N(\mu, \sigma^2)$ is expressed as (5):


$h(X) = -\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \log\left(\frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\right) dx = \frac{1}{2}\log(2\pi e \sigma^2)$ (5)

For a given frequency band i, the DE is defined as (6):

$h_i(X) = \frac{1}{2}\log(2\pi e \sigma_i^2)$ (6)


where e is Euler's number (2.71828), $\sigma_i^2$ is the variance of the signal in band i, and $h_i$ represents the DE
of the corresponding EEG signals in the frequency band.
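
Under the Gaussian assumption, (6) reduces DE to a function of the band variance, so it can be computed directly from a band-filtered signal. The sketch below assumes the signal has already been filtered to one frequency band; the function names and the one-second windowing helper are illustrative choices, not the procedure of [40].

```python
import numpy as np

def differential_entropy(band_signal):
    """Eq. (6): DE of a band-limited signal, assuming it is Gaussian distributed."""
    variance = np.var(np.asarray(band_signal, dtype=float))
    return 0.5 * np.log(2 * np.pi * np.e * variance)

def de_per_second(band_signal, fs):
    """DE of each non-overlapping one-second window (fs samples per window)."""
    x = np.asarray(band_signal, dtype=float)
    n = len(x) // fs                              # number of whole seconds
    windows = x[:n * fs].reshape(n, fs)
    return 0.5 * np.log(2 * np.pi * np.e * windows.var(axis=1))
```

Doubling the amplitude of a signal multiplies its variance by four and therefore raises its DE by log 2, which shows how DE separates high-energy from low-energy patterns.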


3.3.2. Feature representation 

It is important to determine the appropriate method to represent the features of the EEG signals due
to their spatial information characteristics. Some of the representation methods used in previous studies
include the multiband feature matrix (MFM) [62], 2D mesh [69], maximal information coefficient (MIC)
[70], and 3D cube [40]. The 3D cube method can maintain spatial information between channels as well as
frequency bands, including theta, alpha, beta, and gamma. It is based on the channel layout of the
international 10-20 system mapped into a 9x9 matrix [40]. The 3D cube representation is also inspired by
computer vision, where an image is built from three basic colors: red, green, and blue (RGB). These RGB color
channels have a value range of 0 to 255, which indicates the intensity of the color in each channel. The DE
features are represented in a 3D cube, as indicated in the processes in Figure 4.


[Figure: panel label recovered: Theta frequency]

Figure 4. Feature representation based on 3D cube [40]


Based on Figure 4, every second, the EEG signals data generated from each EEG channel is 
decomposed into 4 frequency bands. Next, the feature extraction process is carried out for each frequency 
band. The feature value of each frequency band is then mapped into a 9x9 matrix so that it will produce 4 
matrices. In the last stage, the 4 matrices are combined into a 3D cube [40]. The feature representation in the 
image is compared with the feature representation in the EEG signals in Table 2. The 3D cube-based feature 
representation has the ability to maintain spatial information between channels and also integrate between the 
frequency bands [40]. 
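
The mapping described above can be sketched as follows. The grid coordinates in `GRID_POS` are hypothetical placeholders for a handful of channels, chosen only to illustrate the idea; the actual 9x9 layout used in [40] is not reproduced here, and the function name is an assumption.

```python
import numpy as np

# Hypothetical (row, col) positions in the 9x9 grid for a few 10-20 channels.
GRID_POS = {"Fz": (2, 4), "C3": (4, 2), "Cz": (4, 4), "C4": (4, 6), "Pz": (6, 4)}
BANDS = ("theta", "alpha", "beta", "gamma")

def build_cube(de_features):
    """Place per-channel DE values into a 9x9x4 cube.

    de_features maps a channel name to four DE values ordered as BANDS.
    Grid cells without an electrode stay at zero.
    """
    cube = np.zeros((9, 9, len(BANDS)))
    for channel, values in de_features.items():
        row, col = GRID_POS[channel]
        cube[row, col, :] = values        # one 9x9 plane per frequency band
    return cube
```

Each slice `cube[:, :, b]` plays the role of one color channel in an image, so a cube built for every second of signal can be fed to image-style models.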


Table 2. Feature representation of images and EEG signals [40] 


Domain    Representation in computer vision    Representation in EEG signals
Term      Color image                          EEG 3D cube
          Color channel (R, G, B)              Frequency band (θ, α, β, γ)
          Color intensity                      Differential entropy feature


3.3.3. Classification process 

Classification is the main process of emotion recognition and is important to study in addition to
the feature extraction and representation processes. There are two approaches to the classification
of emotion through EEG: the machine learning and neural network approaches.

a) Machine learning approach: The methods usually applied include decision tree (DT) [71], naive
Bayes (NB) [72], quadratic discriminant analysis (QDA) [73], k-nearest neighbors (KNN) [58], [74],
[75], linear discriminant analysis (LDA) [14], relevance vector machines (RVM) [67], extreme gradient
boosting (XGBoost) [76], support vector machine (SVM) [77]-[79], AdaBoost [80], logistic regression
via variable splitting and augmented Lagrangian (LORSAL) [81], random forest (RF) [56], [82], and
graph regularized extreme learning machine (GELM) [83].

b) Neural network approach: These methods include artificial neural network (ANN) [61], [63], [84], deep
belief networks [70], [85], convolutional neural network (CNN) [40], [46], [86], [87], long short-term
memory (LSTM) [66], generative adversarial networks (GAN) [88], capsule network (CapsNet) [45],
[62], and hybrid methods [4], [44], [69].

Based on the articles obtained from the Scopus database from 2016-2020, CNN and SVM methods
have been the most studied for emotion recognition based on EEG signals. Figure 5 presents the distribution of
methods used for emotion recognition based on EEG signals. Some deep learning
methods, however, have superior accuracy compared to machine learning methods. Table 3
summarizes the highest accuracy achieved by several deep learning methods in the classification of
emotions based on EEG signals.

Although the CNN method has slightly outperformed the capsule network method on the
DREAMER dataset, the capsule network method has several advantages in recognizing emotions
based on EEG signals, such as its ability to: i) effectively characterize the spatial relationships between
different features [89] and ii) be trained effectively on a much smaller data scale compared to
CNN [45]. As shown in Figure 6, the structure of the capsule network method generally consists of several parts,
which include the following:


Figure 5. Distribution of EEG signals-based emotion classification method


Table 3. Comparison of the accuracy of deep learning methods 


No  Methods                   Emotion classes      DEAP dataset   DREAMER dataset   AMIGOS dataset
1   MLF Capsule Network [45]  2 emotional classes  V: 97.97%      V: 94.59%         -
                                                   A: 98.31%      A: 95.26%
2   RACNN [86]                2 emotional classes  V: 96.65%      V: 95.55%         -
                                                   A: 97.11%      A: 97.01%
3   3D-CNN [46]               2 emotional classes  V: 96.43%      -                 V: 96.96%
                                                   A: 96.61%                        A: 97.52%
4   3D-CNN [46]               4 emotional classes  93.53%         -                 95.95%

V: high/low valence; A: high/low arousal. The 4 emotional classes are high arousal and positive valence; high
arousal and negative valence; low arousal and negative valence; and low arousal and positive valence.

[Figure: capsule network diagram; block labels recovered: ReLU Conv1, PrimaryCaps, DigitCaps]

Figure 6. Capsule network architecture [89] 
a) Convolutional section, where the convolution process is conducted on the input data matrix using the ReLU
activation function to produce the feature map to be used as the data input for the PrimaryCaps.

b) PrimaryCaps section, which consists of four processes: i) convolution, ii) concatenate, iii)
bottleneck, and iv) reshape. The reshaping process generates the vector data $u_i$, which
represents the input vector of the lower capsule i.

c) DigitCaps section includes several processes, including the following:

- The affine transformation process aims to represent the spatial relationship between the sub-objects of
the total objects at a higher layer. This is further used to predict the correlation of these sub-objects with
objects at higher levels. The vector $u_i$ and matrix $W_{ij}$ are multiplied to produce the vector $\hat{u}_{j|i}$,
where j represents the index of each class output.

$\hat{u}_{j|i} = W_{ij} u_i$ (7)


- The weighted sum process is conducted based on the multiplication of $c_{ij}$ with the input vector
$\hat{u}_{j|i}$ to produce the vector $s_j$.


$s_j = \sum_i c_{ij} \hat{u}_{j|i}$ (8)


The $c_{ij}$ is determined using a dynamic routing algorithm that iterates several times (by default three
times) to generate the $c_{ij}$ values, as indicated in Table 4.


Table 4. Dynamic routing algorithm [89] 


Dynamic routing algorithm

procedure ROUTING($\hat{u}_{j|i}$, r, l)
    for all capsule i in layer l and capsule j in layer (l + 1): $b_{ij} \leftarrow 0$
    for r iterations do
        for all capsule i in layer l: $c_i \leftarrow \mathrm{softmax}(b_i)$
        for all capsule j in layer (l + 1): $s_j \leftarrow \sum_i c_{ij} \hat{u}_{j|i}$
        for all capsule j in layer (l + 1): $v_j \leftarrow \mathrm{squash}(s_j)$
        for all capsule i in layer l and capsule j in layer (l + 1): $b_{ij} \leftarrow b_{ij} + \hat{u}_{j|i} \cdot v_j$
    return $v_j$


The process aims to project several predictive vectors ($\hat{u}_{j|i}$) using the coupling coefficients ($c_{ij}$) in order
to produce the weighted sum value ($s_j$).


- The squashing process generates an output vector $v_j$ for each class at a higher level (l+1), and the
maximum or highest value of $v_j$ determines the predicted class level. The squashing function to obtain
the probability value in each prediction class is, therefore, represented using (9).


$v_j = \frac{\|s_j\|^2}{1 + \|s_j\|^2} \frac{s_j}{\|s_j\|}$ (9)


d) The loss function calculation section is used to calculate the loss value based on the output and target
values using the L2 regularization method.


$L_e = T_e \max(0, m^+ - \|v_e\|)^2 + \lambda (1 - T_e) \max(0, \|v_e\| - m^-)^2$ (10)


$T_e$ is equal to 1 if the emotion class e matches the target, $m^- = 0.1$ and $m^+ = 0.9$, and $\lambda$ is the
down-weighting of the loss function. By default, $\lambda = 0.5$, and $v_e$ represents the output vector of class e.
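
The DigitCaps computations in (7) to (10) and the routing loop of Table 4 can be sketched in NumPy as below. This is an illustrative forward pass only, with no training, and it is not the MLF capsule network of [45]. The array shapes, function names, and the small `eps` added for numerical stability are assumptions of this sketch.

```python
import numpy as np

def predict_vectors(W, u):
    """Eq. (7): u_hat[i, j] = W[i, j] @ u[i].
    W: (n_lower, n_upper, dim_out, dim_in), u: (n_lower, dim_in)."""
    return np.einsum("ijok,ik->ijo", W, u)

def squash(s, eps=1e-9):
    """Eq. (9): keep the direction of s and map its length into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(b, axis):
    e = np.exp(b - b.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(u_hat, r=3):
    """Table 4: route predictions u_hat (n_lower, n_upper, dim) for r iterations."""
    n_lower, n_upper, _ = u_hat.shape
    b = np.zeros((n_lower, n_upper))                   # routing logits b_ij
    for _ in range(r):
        c = softmax(b, axis=1)                         # coupling coefficients c_ij
        s = (c[:, :, None] * u_hat).sum(axis=0)        # Eq. (8): weighted sum s_j
        v = squash(s)                                  # Eq. (9): output vectors v_j
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)   # agreement u_hat . v_j
    return v

def margin_loss(v, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Eq. (10) summed over classes; target is a one-hot vector of T_e values."""
    length = np.linalg.norm(v, axis=-1)
    present = target * np.maximum(0.0, m_plus - length) ** 2
    absent = lam * (1.0 - target) * np.maximum(0.0, length - m_minus) ** 2
    return float(np.sum(present + absent))
```

The predicted class is the index of the longest output vector, for example `np.argmax(np.linalg.norm(v, axis=-1))`.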

The capsule network method has several advantages over the others. Still, it can lose
knowledge information within the convolution process that generates the feature maps, and it requires higher
computation time than other deep learning methods [45]. This means the next research challenge is
determining an appropriate architecture of the capsule network method to overcome the loss of knowledge
information in the primary capsule. Moreover, it is also crucial to study new architectures of the capsule
network method to overcome the high computation time in the classification process. Considering that each
emotion dataset has different characteristics, such as the number of channels used, the number of
respondents, and the experimental strategy, it is also important for further research to study the capsule
network method on more diverse data sets and to introduce 4 emotion classes. Such a study would aim to
obtain a more robust capsule network architecture on different datasets.


4. CONCLUSION 

Although various studies have been conducted to overcome the three issues of emotion recognition
based on EEG signals, several challenges remain for further study, including: i) determining
robust methods for imbalanced EEG signals data, ii) determining the appropriate smoothing method to
eliminate external or internal disturbances and emotional reactions in the baseline signals, iii) determining the
best baseline reduction methods to account for the differences in the characteristics of the participants in the
trial signals, and iv) determining a robust architecture of the capsule network method to overcome the loss of
knowledge information and applying it to more diverse data sets. Addressing these challenges is expected
to produce robust models in emotion recognition based on EEG signals. This research study, however, has
some limitations regarding the number of articles used. Therefore, further research needs to expand the scope
of the review of the emotion recognition domain, such as the publication years and topics covered.